Testing by VyasGuru · Pull Request #333 · RunanywhereAI/runanywhere-sdks

VyasGuru · 2026-02-05T15:55:13Z

Changes

changed the resampling code to be more vectorisation-friendly: use restrict, since pointer aliasing is guaranteed not to happen; simplify the interpolation so that it can use FMA; split the loop into a fast loop and a safe loop to reduce bounds checks.
reserved memory for full_text and the result segments
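
The resampling change described above can be sketched roughly as follows. This is a minimal illustration under assumptions, not the code from the PR: the name `resample_linear_sketch` and the constant `kTargetRate` (standing in for `WHISPER_SAMPLE_RATE`) are hypothetical, and `__restrict` is a compiler extension (accepted by GCC, Clang and MSVC) rather than standard C++.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr int kTargetRate = 16000;  // hypothetical stand-in for WHISPER_SAMPLE_RATE

std::vector<float> resample_linear_sketch(const std::vector<float>& samples, int source_rate) {
    assert(source_rate > 0);
    if (samples.empty() || source_rate == kTargetRate) return samples;  // early exit

    const double step = static_cast<double>(source_rate) / kTargetRate;
    const std::size_t src_size = samples.size();
    const std::size_t out_size = static_cast<std::size_t>(src_size / step);
    std::vector<float> output(out_size);

    // Input and output never overlap, so the compiler may assume no aliasing.
    const float* __restrict src = samples.data();
    float* __restrict dst = output.data();

    // For i < safe_limit, i * step < src_size - 1, so idx0 + 1 is always in range.
    std::size_t safe_limit = 0;
    if (src_size >= 2) {
        safe_limit = static_cast<std::size_t>((src_size - 1) / step);
        if (safe_limit > out_size) safe_limit = out_size;
    }

    // Fast loop: no clamping, no branches, each iteration independent.
    for (std::size_t i = 0; i < safe_limit; ++i) {
        const double pos = static_cast<double>(i) * step;
        const std::size_t idx0 = static_cast<std::size_t>(pos);
        const float frac = static_cast<float>(pos - static_cast<double>(idx0));
        const float v0 = src[idx0];
        const float v1 = src[idx0 + 1];
        dst[i] = v0 + frac * (v1 - v0);  // one multiply-add, FMA-friendly
    }

    // Safe loop: the few remaining samples clamp their indices to the last input sample.
    for (std::size_t i = safe_limit; i < out_size; ++i) {
        const double pos = static_cast<double>(i) * step;
        std::size_t idx0 = static_cast<std::size_t>(pos);
        if (idx0 >= src_size) idx0 = src_size - 1;
        const std::size_t idx1 = (idx0 + 1 < src_size) ? idx0 + 1 : src_size - 1;
        const float frac = static_cast<float>(pos - static_cast<double>(idx0));
        dst[i] = src[idx0] + frac * (src[idx1] - src[idx0]);
    }
    return output;
}
```

Computing `pos` from `i` instead of accumulating it keeps iterations independent of each other, which is one way to make the fast loop easier for a compiler to vectorise; whether it actually does depends on the compiler and flags.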

Results

Tested against old code.
1.5 - 1.8x faster resampling
the memory reserves reduce heap thrashing as the += will trigger fewer allocations

Note

the amount of memory reserved is currently chosen as a "reasonable" amount. a better reserving heuristic should probably be investigated
there is a tiny drift in the resampling output due to the nature of floats. it is minuscule
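
To illustrate where such drift can come from, the sketch below compares the two interpolation forms that appear in the diff for this PR: the old form mixes in double precision and rounds once, while the new form works in single precision. The function names are hypothetical and the example only shows that the two agree to within a small tolerance; it makes no claim about the exact magnitude observed in the PR.

```cpp
#include <cmath>

// Old form: weights applied in double precision, rounded to float at the end.
float lerp_double_mix(float v0, float v1, double frac) {
    return static_cast<float>(v0 * (1.0 - frac) + v1 * frac);
}

// New form: single-precision multiply-add, which a compiler can map to FMA.
float lerp_float_fma_form(float v0, float v1, double frac) {
    return v0 + static_cast<float>(frac) * (v1 - v0);
}

// Absolute difference between the two forms for one sample pair.
float lerp_drift(float v0, float v1, double frac) {
    return std::fabs(lerp_double_mix(v0, v1, frac) - lerp_float_fma_form(v0, v1, frac));
}
```

For audio samples in the range [-1, 1] the difference is on the order of a few float rounding steps, so a comparison against the old output needs a small tolerance rather than exact equality.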

Important

Optimizes resampling and memory management in whispercpp_backend.cpp for improved performance and reduced heap thrashing.

Performance Optimization:
- Rewrites resample_to_16khz() in whispercpp_backend.cpp for vectorization, using restrict pointers and splitting loops to reduce checks.
- Pre-allocates memory for full_text and result.segments in transcribe_internal() to reduce heap thrashing.
Testing and Results:
- Tested against old code, achieving 1.5 - 1.8x faster resampling.
- Memory pre-allocation reduces heap thrashing.
Notes:
- Current memory reservation is a reasonable estimate; further heuristic improvements are possible.
- Minor drift in resampling output due to floating-point precision.


Summary by CodeRabbit

Refactor
- Optimized audio transcription pipeline for reduced memory use and faster, more reliable processing.
- Improved audio resampling with early-exit for matching rates, a fast path for common ratios, and a robust interpolation path for arbitrary rates, increasing stability and accuracy.

Greptile Overview

Greptile Summary

This PR optimizes the WhisperCPP backend for better performance through vectorization-friendly code and memory pre-allocation strategies.

Key improvements:

- Resampling algorithm refactored to use restrict pointers and FMA-friendly interpolation (val0 + frac * (val1 - val0))
- Added fast path for integer ratio resampling (e.g., 48kHz → 16kHz uses simple stride-based decimation)
- Split resampling into fast loop and safe boundary-checking loop to reduce conditional branches
- Memory pre-allocation for full_text, segments, and word_timings vectors to reduce heap thrashing
- Changed from push_back to emplace_back with reference to avoid unnecessary copies
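
The pre-allocation and in-place construction pattern from the last two points can be sketched as below. The types `SegmentSketch` and `ResultSketch` and the function `collect_segments_sketch` are hypothetical stand-ins, not the SDK's real types; only the factor of 64 bytes per segment is taken from this PR.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for the backend's segment and result types.
struct SegmentSketch {
    std::string text;
    long long start_ms = 0;
    long long end_ms = 0;
};

struct ResultSketch {
    std::string full_text;
    std::vector<SegmentSketch> segments;
};

ResultSketch collect_segments_sketch(const std::vector<std::string>& texts) {
    ResultSketch result;
    const std::size_t n_segments = texts.size();

    // Reserve once up front so the appends below rarely reallocate.
    result.full_text.reserve(n_segments * 64);
    result.segments.reserve(n_segments);

    for (std::size_t i = 0; i < n_segments; ++i) {
        // Construct the element in place, then fill it through a reference
        // instead of building a temporary and copying it in with push_back.
        // (Since C++17, emplace_back() returns this reference directly.)
        result.segments.emplace_back();
        SegmentSketch& seg = result.segments.back();
        seg.text = texts[i];
        seg.start_ms = static_cast<long long>(i) * 1000;
        seg.end_ms = seg.start_ms + 1000;

        result.full_text += texts[i];
    }
    return result;
}
```

The reference stays valid within each iteration because `reserve` guarantees that no reallocation happens until the reserved capacity is exceeded.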

Minor issues found:

- Indentation inconsistency in the integer ratio fast path (lines 566-573)
- Trailing whitespace on lines 277 and 564
- The fallback loop (lines 597-609) uses push_back after calculating exact size, which could use indexed writes for better performance

Confidence Score: 4/5

- Safe to merge with minor formatting fixes recommended.
- The changes are well-tested (per PR description: 1.5-1.8x faster), mathematically sound optimizations for audio resampling and memory management. The only issues are cosmetic (indentation/whitespace) and don't affect functionality.
- No files require special attention; formatting issues are minor.

Important Files Changed

Filename	Overview
sdk/runanywhere-commons/src/backends/whispercpp/whispercpp_backend.cpp	Optimized resampling algorithm with vectorization improvements and memory reservation; contains minor formatting issues (indentation, whitespace)

Sequence Diagram

sequenceDiagram
    participant Client
    participant WhisperCppSTT
    participant VectorMemory as Vector (Pre-allocated)
    participant ResampleFn as resample_to_16khz
    
    Client->>WhisperCppSTT: transcribe_internal(audio, config)
    
    Note over WhisperCppSTT: Process audio segments
    WhisperCppSTT->>VectorMemory: full_text.reserve(n_segments * 64)
    WhisperCppSTT->>VectorMemory: segments.reserve(n_segments)
    
    alt word_timestamps enabled
        WhisperCppSTT->>VectorMemory: word_timings.reserve(n_segments * 15)
    end
    
    loop for each segment
        WhisperCppSTT->>VectorMemory: segments.emplace_back()
        Note over WhisperCppSTT: Direct construction in place
        
        alt word_timestamps
            loop for each token
                WhisperCppSTT->>VectorMemory: word_timings.emplace_back()
            end
        end
    end
    
    Client->>WhisperCppSTT: needs resampling
    WhisperCppSTT->>ResampleFn: resample_to_16khz(samples, source_rate)
    
    alt integer ratio (e.g., 48kHz -> 16kHz)
        ResampleFn->>ResampleFn: Fast path: stride-based decimation
        ResampleFn-->>WhisperCppSTT: return output
    else non-integer ratio
        ResampleFn->>VectorMemory: output.reserve(output_size)
        ResampleFn->>ResampleFn: Fast loop with restrict pointer
        Note over ResampleFn: FMA-friendly interpolation:<br/>val0 + frac * (val1 - val0)
        ResampleFn->>ResampleFn: Safe loop for remaining samples
        ResampleFn-->>WhisperCppSTT: return output
    end
    
    WhisperCppSTT-->>Client: return STTResult



coderabbitai · 2026-02-05T15:55:24Z

Walkthrough

Reworks WhisperCPP backend transcription and resampling: replaces push_back with emplace_back and in-place initialization for segments/word timings, adds container reservations, and implements three resampling paths (early-exit, stride-based for exact multiples, and general interpolation) with tightened bounds and safety checks.

Changes

Cohort / File(s)	Summary
WhisperCPP Backend `sdk/runanywhere-commons/src/backends/whispercpp/whispercpp_backend.cpp`	Switched segment/word-timing construction to in-place `emplace_back` with reference-based field assignment and upfront `reserve`; removed redundant pushes; added three resampling strategies (no-op early exit, optimized stride sampling for integer multiples, and interpolation for arbitrary rates); refactored loops and tightened safety/bounds checks while preserving cancellation and streaming paths.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

🐰 I nudge the buffers, emplace with care,

stride and interpolate through crisp, cool air,
Early exits hum, while boundaries bind,
Segments stitched neatly — a rabbit's delight to find.

🚥 Pre-merge checks | ❌ 3

❌ Failed checks (2 warnings, 1 inconclusive)

Check name	Status	Explanation	Resolution
Description check	⚠️ Warning	The pull request description provides detailed technical information about changes, testing results, and performance improvements, but does not follow the required template structure.	Fill out the required description template by adding sections for Type of Change, Testing checklist items, Platform-Specific Testing, Labels, and Checklist items as specified in the repository guidelines.
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check	❓ Inconclusive	The title 'Testing' is vague and generic, using a non-descriptive term that does not convey meaningful information about the changeset.	Use a more descriptive title that reflects the main change, such as 'Optimize resampling and memory allocation in WhisperCPP backend' or 'Improve performance of audio resampling with vectorization and pre-allocation'.


ellipsis-dev

Important

Looks good to me! 👍

Reviewed everything up to f901cf0 in 14 seconds.

Reviewed 121 lines of code in 1 files
Skipped 0 files when reviewing.

Workflow ID: wflow_hzUsXJ9MGf04l6Kz


greptile-apps

1 file reviewed, 4 comments

greptile-apps · 2026-02-05T16:02:54Z

sdk/runanywhere-commons/src/backends/whispercpp/whispercpp_backend.cpp

+    if (source_rate % WHISPER_SAMPLE_RATE == 0) {
+    int stride = source_rate / WHISPER_SAMPLE_RATE;
+    output.resize(samples.size() / stride);
+    for (size_t i = 0; i < output.size(); ++i) {
+        output[i] = samples[i * stride];
+    }
+    return output;
+}


Indentation inconsistent with surrounding code - missing 4 spaces

Suggested change

    if (source_rate % WHISPER_SAMPLE_RATE == 0) {
        int stride = source_rate / WHISPER_SAMPLE_RATE;
        output.resize(samples.size() / stride);
        for (size_t i = 0; i < output.size(); ++i) {
            output[i] = samples[i * stride];
        }
        return output;
    }


greptile-apps · 2026-02-05T16:02:55Z

sdk/runanywhere-commons/src/backends/whispercpp/whispercpp_backend.cpp

+    const size_t output_size = static_cast<size_t>(samples.size() / step);
+
+    std::vector<float> output;
+


Blank line with trailing whitespace after declaration

Suggested change

std::vector<float> output;


greptile-apps · 2026-02-05T16:02:56Z

sdk/runanywhere-commons/src/backends/whispercpp/whispercpp_backend.cpp


    const int n_segments = whisper_full_n_segments(ctx_);
    std::string full_text;
+    full_text.reserve(n_segments * 64); 


Trailing whitespace after reserve call

Suggested change

full_text.reserve(n_segments * 64);


greptile-apps · 2026-02-05T16:02:56Z

sdk/runanywhere-commons/src/backends/whispercpp/whispercpp_backend.cpp

+    while (output.size() < output_size) {
+        size_t idx0 = static_cast<size_t>(pos);
+        if (idx0 >= src_size) idx0 = src_size - 1;

-    std::vector<float> output(output_size);
+        size_t idx1 = (idx0 + 1 < src_size) ? idx0 + 1 : src_size - 1;

-    for (size_t i = 0; i < output_size; ++i) {
-        const double src_idx = i / ratio;
-        const size_t idx0 = static_cast<size_t>(src_idx);
-        const size_t idx1 = std::min(idx0 + 1, samples.size() - 1);
-        const double frac = src_idx - idx0;
+        double frac = pos - static_cast<double>(idx0);
+        float val0 = src_ptr[idx0];
+        float val1 = src_ptr[idx1];

-        output[i] = static_cast<float>(samples[idx0] * (1.0 - frac) + samples[idx1] * frac);
+        output.push_back(val0 + static_cast<float>(frac) * (val1 - val0));
+        pos += step;
    }


Looping with push_back after the exact size has been calculated reduces the performance benefit of the vectorization optimization. Consider using indexed writes into a pre-sized vector, since output_size is known upfront.
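
One way to read this suggestion is sketched below: size the vector once and write each element by index, with the same index clamping as the fallback loop in the diff. The function name `resample_indexed_sketch` is hypothetical, and the position is computed from the index rather than accumulated with `pos += step`, which is an assumption of this sketch and not something the review asked for.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

std::vector<float> resample_indexed_sketch(const std::vector<float>& samples, double step) {
    assert(step > 0.0);
    if (samples.empty()) return {};

    const std::size_t src_size = samples.size();
    const std::size_t output_size = static_cast<std::size_t>(src_size / step);

    // Size the vector once; every element is then written exactly once by index,
    // so there is no per-element capacity check as with push_back.
    std::vector<float> output(output_size);

    for (std::size_t i = 0; i < output_size; ++i) {
        const double pos = static_cast<double>(i) * step;
        std::size_t idx0 = static_cast<std::size_t>(pos);
        if (idx0 >= src_size) idx0 = src_size - 1;  // clamp at the end of the input
        const std::size_t idx1 = (idx0 + 1 < src_size) ? idx0 + 1 : src_size - 1;

        const float frac = static_cast<float>(pos - static_cast<double>(idx0));
        const float val0 = samples[idx0];
        const float val1 = samples[idx1];
        output[i] = val0 + frac * (val1 - val0);
    }
    return output;
}
```

Constructing `std::vector<float> output(output_size)` zero-fills the buffer first, so this trades the push_back bookkeeping for one extra pass over the output.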


coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In `@sdk/runanywhere-commons/src/backends/whispercpp/whispercpp_backend.cpp`:
- Around line 560-608: The resampling code can return an empty vector when
output_size computes to 0 for very small input chunks; clamp output_size to at
least 1 (e.g., size_t output_size = std::max(static_cast<size_t>(1),
static_cast<size_t>(samples.size() / step));) and mirror that clamping in the
integer-stride fast path (ensure stride_out_size = std::max<size_t>(1,
samples.size() / stride) and use that for output.resize and indexing), and
ensure subsequent loops use the clamped output_size/safe_output_limit so no
audio is dropped nor out-of-range indexing occurs (adjust pos/init or bounds
checks as needed).
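
A minimal sketch of the clamping suggested above, applied to the integer-stride fast path. The function name `decimate_clamped_sketch` and the constant `kWhisperRate` (standing in for `WHISPER_SAMPLE_RATE`) are hypothetical, and returning an empty vector for empty input is an assumption of this sketch; the review comment does not say how that case should be handled.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr int kWhisperRate = 16000;  // hypothetical stand-in for WHISPER_SAMPLE_RATE

std::vector<float> decimate_clamped_sketch(const std::vector<float>& samples, int source_rate) {
    // Only valid for exact integer multiples of the target rate, e.g. 48 kHz -> 16 kHz.
    assert(source_rate > 0 && source_rate % kWhisperRate == 0);
    if (samples.empty()) return {};  // assumption: nothing to emit for empty input

    const std::size_t stride = static_cast<std::size_t>(source_rate / kWhisperRate);

    // Clamp to at least one output sample so a chunk shorter than the stride
    // still produces output instead of an empty vector.
    const std::size_t out_size = std::max<std::size_t>(1, samples.size() / stride);

    std::vector<float> output(out_size);
    for (std::size_t i = 0; i < out_size; ++i) {
        // In range: i * stride <= (out_size - 1) * stride < samples.size().
        output[i] = samples[i * stride];
    }
    return output;
}
```

For a two-sample chunk at 48 kHz, `samples.size() / stride` is 0, so without the clamp the chunk would be dropped; with it, the first sample is kept.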

sdk/runanywhere-commons/src/backends/whispercpp/whispercpp_backend.cpp

VyasGuru · 2026-02-05T16:08:08Z

will fix the indentation and push soon.

shubhammalhotra28 · 2026-02-06T07:25:33Z

We're not using whisper c++ as of now for the voice use cases; those just use sherpa onnx.

shubhammalhotra28 · 2026-02-06T07:25:53Z

but will just merge, since it's fine to merge.
Thanks @VyasGuru

* Optimise resampling and heap thrashing
  opti
* improved even more for integer ratios
* Update whispercpp_backend.cpp

WatchFace: - Time truly centered (Arrangement.Center) — no more top-heavy layout - AI status dot moved to top-center, battery to subtle top-right (8sp) - MIC button: 36dp → 28dp, camera: 28dp → 22dp — less dominant - Bottom action row with spacedBy(16.dp) instead of Spacer hacks TranscriptionScreen: - Back button: dark background circle + arrow unicode (←) — visible on all bezels - Title centered via Box with Alignment.Center (not SpaceBetween) - Divider inset extra 4dp on watch to avoid round edge - Footer: entry count hidden on watch, "Clear all" centered, 8dp bottom padding CameraOverlay: - Close button: top-center on watch (not top-right corner) — avoids bezel clip - Brighter background (RunanywhereAI#333) for close button visibility - Capture/? buttons: bottom padding 16dp → 24dp — well within bezel - Preview: 120dp → 100dp — more breathing room around edges Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

VyasGuru added 2 commits February 5, 2026 20:59

Optimise resampling and heap thrashing

8caec34

opti

Merge branch 'testing' of https://github.com/VyasGuru/runanywhere-sdks …

2bea5e4

…into testing

improved even more for integer ratios

f901cf0

VyasGuru marked this pull request as ready for review February 5, 2026 15:59

ellipsis-dev bot reviewed Feb 5, 2026


greptile-apps bot reviewed Feb 5, 2026


coderabbitai bot reviewed Feb 5, 2026


Update whispercpp_backend.cpp

3367adb

shubhammalhotra28 merged commit f5b25a9 into RunanywhereAI:main Feb 6, 2026
7 of 11 checks passed

ManthanNimodiya pushed a commit to ManthanNimodiya/runanywhere-sdks that referenced this pull request Feb 23, 2026

Testing (RunanywhereAI#333)

b746643

